Skip to content

feat(observability): emit structured idempotency events on all 4 claimDelivery outcomes#242

Merged
chrisleekr merged 3 commits into
mainfrom
fix/issue-232
Jun 20, 2026
Merged

feat(observability): emit structured idempotency events on all 4 claimDelivery outcomes#242
chrisleekr merged 3 commits into
mainfrom
fix/issue-232

Conversation

@chrisleekr

@chrisleekr chrisleekr commented Jun 20, 2026

Copy link
Copy Markdown
Owner

What changed and why

claimDelivery had four terminal outcomes but only one emitted a queryable structured event field (the legacy dedup-skip). The claim-won happy path was completely silent, so the dedup hit-rate denominator and the fail-open rate were not queryable from structured logs. Tracked as issue #232.

Behaviour is unchanged — logging-only. The SET-NX-EX command, TTL, return values, control flow, and fail-open semantics are byte-for-byte identical.

Changes

  • src/webhook/idempotency-log-fields.ts (new): z.discriminatedUnion("event", …) schema over the three-event family, mirroring src/utils/retry-log-fields.ts. Not a runtime validator on the hot path; it is the drift-prevention contract the co-located test parses each captured emit through. deliveryId stays camelCase (the established repo-wide child-logger ID binding); new metric fields use snake_case.
  • src/webhook/idempotency.ts: claimDelivery now emits a uniform event on all 4 outcomes:
    • idempotency.claimed — info, new (was unlogged); SET-NX won, caller proceeds.
    • idempotency.duplicate_skipped — info; replaces the legacy "dedup-skip" literal.
    • idempotency.failed_open reason: "unavailable" — warn; Valkey unconfigured/disconnected (no SET issued).
    • idempotency.failed_open reason: "error" + err — warn; the SET threw.
  • test/webhook/idempotency-log-fields.test.ts (new): strict-schema drift tests (accept well-formed events; reject unknown literals, extra fields, invalid/missing reason, empty deliveryId).
  • test/webhook/idempotency.test.ts: upgraded to a recording logger; each captured emit is parsed through IdempotencyLogFieldsSchema to prove emit/schema agreement (closing the drift hole a loose string check leaves open). Return-value + SET-arg assertions preserved.
  • docs/operate/observability.md: new ## Idempotency log fields section (events, levels, fields).

Flow

flowchart TD
    CDEntry["claimDelivery called"]:::gate
    CheckValkey{"Valkey available<br/>and healthy?"}:::gate
    DoSet["SET key NX EX 259200"]:::gate
    ErrThrown{"SET threw?"}:::gate
    ResOK{"result is OK?"}:::gate
    WarnUnavail["warn idempotency.failed_open<br/>reason unavailable, return true"]:::ev
    InfoClaimed["info idempotency.claimed<br/>return true"]:::ev
    InfoSkipped["info idempotency.duplicate_skipped<br/>return false"]:::ev
    WarnError["warn idempotency.failed_open<br/>reason error plus err, return true"]:::ev
    CDEntry --> CheckValkey
    CheckValkey -- no --> WarnUnavail
    CheckValkey -- yes --> DoSet
    DoSet --> ErrThrown
    ErrThrown -- yes --> WarnError
    ErrThrown -- no --> ResOK
    ResOK -- yes --> InfoClaimed
    ResOK -- no --> InfoSkipped
    classDef ev fill:#196f3d,color:#ffffff
    classDef gate fill:#2c3e50,color:#ffffff
Loading

Test plan

  • test/webhook/idempotency-log-fields.test.ts — strict-schema drift (extra field / unknown event / bad reason / empty deliveryId rejected)
  • test/webhook/idempotency.test.ts — all 4 outcomes asserted; each emit parsed through the schema; return-value + SET-arg checks kept
  • dedup-skip literal retired (no remaining reference in src/)
  • typecheck / lint / format / docs:build clean
  • Full bun test: stash-baseline identical (1063 pass, 0 new failures); remaining failures are pre-existing local DB/Valkey-absent suites. Isolated CI runner is authoritative.

Closes #232

Summary by CodeRabbit

  • Documentation

    • Enhanced observability documentation to detail webhook delivery deduplication events and their logging behavior.
  • Tests

    • Added comprehensive test coverage for webhook idempotency event logging validation.

…mDelivery outcomes

claimDelivery had four terminal outcomes but only one emitted a queryable
event, and the claim-won happy path was unlogged, so dedup-hit-rate and
fail-open-rate were not queryable. Add src/webhook/idempotency-log-fields.ts
(strict z.discriminatedUnion, mirrors retry-log-fields) and emit a uniform
event on all four outcomes: idempotency.claimed (new), duplicate_skipped
(replaces the dedup-skip literal), and failed_open with reason
unavailable/error. Logging-only: returns, SET command, TTL, and fail-open
semantics unchanged. The emit-site test now parses each captured emit
through the schema to prove emit/schema agreement.

Closes #232

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_015v2bspPF1ZTTG9ZZrbBgA1
Copilot AI review requested due to automatic review settings June 20, 2026 13:40
@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@chrisleekr, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 42 minutes and 52 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ed59b0ae-8e58-484b-8741-8892de17c2dd

📥 Commits

Reviewing files that changed from the base of the PR and between 1c945d3 and 63b2ee7.

📒 Files selected for processing (5)
  • docs/operate/observability.md
  • src/webhook/idempotency-log-fields.ts
  • src/webhook/idempotency.ts
  • test/webhook/idempotency-log-fields.test.ts
  • test/webhook/idempotency.test.ts
📝 Walkthrough

Walkthrough

Adds structured idempotency log events to the claimDelivery webhook function. A new idempotency-log-fields.ts module defines IDEMPOTENCY_LOG_EVENTS constants and a Zod discriminated-union schema for three event types. claimDelivery is updated to emit these structured events at all outcome paths. Schema tests, integration test updates, and observability documentation are included.

Changes

Structured idempotency log events

Layer / File(s) Summary
Idempotency log-field schema and constants
src/webhook/idempotency-log-fields.ts, test/webhook/idempotency-log-fields.test.ts
New module exports IDEMPOTENCY_LOG_EVENTS string constants and IdempotencyLogFieldsSchema as a strict Zod discriminated union on event, with per-branch required/optional fields for claimed, duplicate_skipped, and failed_open. Tests assert the constant values and exercise schema acceptance and rejection for all branches.
claimDelivery structured event emission
src/webhook/idempotency.ts, test/webhook/idempotency.test.ts
Imports IDEMPOTENCY_LOG_EVENTS and replaces all log payloads in claimDelivery: the Valkey-unavailable path emits reason: "unavailable", the SET NX EX result now logs claimed or duplicateSkipped events with correct return values, and the error catch emits reason: "error" with an err field. Integration tests add a recording logger and an expectEmittedEvent() helper that schema-validates captured log entries.
Observability documentation
docs/operate/observability.md
Adds an "Idempotency log fields" section documenting the three-event family, log levels, and per-event fields (deliveryId, reason, err).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

type: feature ✨, type: docs 📋

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the primary change: adding structured idempotency events across all claimDelivery outcomes for observability.
Linked Issues check ✅ Passed The PR fully addresses all linked issue objectives: structured events for all four outcomes, reason discriminator for failed_open, Zod schema file following established patterns, comprehensive test coverage, and documentation updates.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the linked issue: idempotency event definitions, schema enforcement, test coverage, and documentation. No unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/operate/observability.md`:
- Line 116: The documentation for `claimDelivery` states it returns `true`
exactly once per `deliveryId`, but this claim does not account for fail-open
behavior. When Valkey is unavailable or erroring, the `idempotency.failed_open`
path can also return `true` for redeliveries, violating strict exactly-once
semantics. Reword the sentence describing `claimDelivery` to clarify that the
exactly-once guarantee applies only to the healthy Valkey claim path, and
explicitly note that fail-open scenarios allow multiple `true` returns for the
same `deliveryId` across redeliveries.

In `@src/webhook/idempotency-log-fields.ts`:
- Around line 52-57: The current schema using z.strictObject with optional err
field allows err to be present for both reason values ("unavailable" and
"error"), but the documented contract specifies that err should only exist when
reason is "error". Refactor the IDEMPOTENCY_LOG_EVENTS.failedOpen schema to use
z.discriminatedUnion or separate branches based on the reason field so that err
is required/present only for reason: "error" and explicitly forbidden for
reason: "unavailable". Additionally, add test cases that verify the schema
rejects payloads containing reason: "unavailable" with an err field present.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2f896fdd-3905-4038-b3ac-f4eb71bc0688

📥 Commits

Reviewing files that changed from the base of the PR and between f3132f2 and 1c945d3.

📒 Files selected for processing (5)
  • docs/operate/observability.md
  • src/webhook/idempotency-log-fields.ts
  • src/webhook/idempotency.ts
  • test/webhook/idempotency-log-fields.test.ts
  • test/webhook/idempotency.test.ts

Comment thread docs/operate/observability.md Outdated
Comment thread src/webhook/idempotency-log-fields.ts

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a structured, queryable idempotency.* event family for the webhook claimDelivery idempotency gate, closing the observability gap where only the legacy duplicate-skip path was tagged.

Changes:

  • Introduces a Zod-backed schema + canonical event constants for claimDelivery log fields (idempotency.*).
  • Emits structured event fields for all terminal outcomes in claimDelivery (claimed, duplicate_skipped, failed_open w/ reason).
  • Adds schema drift tests and updates idempotency tests to record and validate emitted log fields; documents the new event family.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/webhook/idempotency.ts Emits structured idempotency.* events across claim outcomes.
src/webhook/idempotency-log-fields.ts Defines canonical event strings and a strict Zod schema for emitted log fields.
test/webhook/idempotency.test.ts Switches to a recording logger and adds assertions around emitted events.
test/webhook/idempotency-log-fields.test.ts Adds acceptance/rejection tests to pin the schema + event literals.
docs/operate/observability.md Documents the new idempotency event family, fields, and levels.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/webhook/idempotency.ts
Comment thread src/webhook/idempotency-log-fields.ts
Comment thread test/webhook/idempotency.test.ts
Comment thread src/webhook/idempotency.ts
Comment thread src/webhook/idempotency.ts
Comment thread test/webhook/idempotency.test.ts
Comment thread test/webhook/idempotency-log-fields.test.ts
Address review: enforce err-only-on-error in the failed_open schema via
separate reason branches (rejecting err on the unavailable path) and clarify
that claimDelivery's exactly-once guarantee holds only on the healthy Valkey
path, degrading to at-least-once when it fails open.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_015v2bspPF1ZTTG9ZZrbBgA1
@chrisleekr

Copy link
Copy Markdown
Owner Author

Both review findings addressed in cad2db6:

  1. docs/operate/observability.md exactly-once wording — reworded to clarify the exactly-once guarantee holds only on the healthy Valkey path; the two fail-open paths return true for redeliveries too, degrading to at-least-once (matching the documented fail-open behaviour).
  2. idempotency-log-fields.ts err-only-on-error — replaced the reason enum + optional err on the failed_open branch with two separate strict branches (reason: "unavailable" forbids err; reason: "error" requires it), so the schema itself enforces the contract. Added tests asserting unavailable + err and error without err are both rejected.

Gates green: 18 schema/emit tests pass, typecheck/lint/format/docs:build clean.

… every emit

Address review: idempotency.claimed fires once per non-duplicate delivery,
so emit it at debug (issue #232 volume policy, mirroring github.api.request)
rather than info. The recording-logger test now parses every captured emit
through IdempotencyLogFieldsSchema at capture time, so any emit/schema drift
surfaces immediately, not only on lines a test asserts on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_015v2bspPF1ZTTG9ZZrbBgA1
@chrisleekr chrisleekr merged commit e1e7f9e into main Jun 20, 2026
15 checks passed
@chrisleekr chrisleekr deleted the fix/issue-232 branch June 20, 2026 14:02
chrisleekr added a commit that referenced this pull request Jun 25, 2026
…trict schemas

Bundles 12 area:observability issues into one change. Each family ships a
Zod-strict *-log-fields.ts module with a co-located drift-prevention test,
following the #242/#244 pattern: structured_output, circuit, digest,
github.api.slow, github.app.token.mint, http, scheduler.scan, workflow.run,
workspace, agent.tool, daemon.connection, k8s.spawn.

Behavioral changes beyond logging: #236 routes 6 mint sites through a new
mintInstallationToken helper (exact cache_hit via scoped hook.before); #218
removes unused onConnected/onDisconnected WsClientOptions callbacks; #223
replaces octokit hook.after/hook.error with one hook.wrap timing closure
(new GITHUB_API_SLOW_REQUEST_MS env, default 3000).

Security: errSerializer now strips octokit event/payload/signature carriers so
the raw webhook body + HMAC signature cannot leak through err: log lines
(regression-tested).

Docs: 12 observability.md sections + alerts, configuration.md env var,
daemon-fleet.md k8s.spawn diagnostics.

Closes #216 #217 #218 #223 #228 #233 #234 #235 #236 #237 #243 #247

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Wjb51MuzTGGiJ2ggGX2DhH
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(observability): structured idempotency.* events for claimDelivery outcomes to make dedup-hit and fail-open rates queryable

2 participants